Scaling the ISLE Framework: Use of Existing Corpus Resources for Validation of MT Evaluation Metrics across Languages

Authors

  • Michelle Vanni
  • Keith J. Miller
Abstract

This paper describes a machine translation (MT) evaluation (MTE) research program which has benefited from the availability of two collections of source language texts and the results of processing these texts with several commercial MT engines (DARPA 1994, Doyon, Taylor, & White 1999). The methodology entails the systematic development of a predictive relationship between discrete, well-defined MTE metrics and specific information processing tasks that can be reliably performed with the output of a given MT system. Unlike tests used in initial experiments on automated scoring (Jones and Rusk 2000), we employ traditional measures of MT output quality, selected from the International Standards for Language Engineering (ISLE) framework: Coherence, Clarity, Syntax, Morphology, and General and Domain-specific Lexical robustness, to include Named-entity translation. Each test was originally validated on MT output produced by three Spanish-to-English systems (1994 DARPA MTE). In the present work, however, we validate the tests with material from the MT Scale Evaluation research program, produced by Japanese-to-English MT systems. Since Spanish and Japanese differ structurally on the morphological, syntactic, and discourse levels, a comparison of scores on tests measuring these output qualities should reveal how structural similarity, such as that enjoyed by Spanish and English, and structural contrast, such as that found between Japanese and English, affect the linguistic distinctions which must be accommodated by MT systems. Moreover, we show that metrics developed using Spanish-English MT output are equally effective when applied to Japanese-English MT output.

Similar resources

Scaling the ISLE Taxonomy: Development of Metrics for the Multi-Dimensional Characterisation of Machine Translation Quality

The DARPA MT evaluations of the early 1990s, along with subsequent work on the MT Scale, and the International Standards for Language Engineering (ISLE) MT Evaluation framework represent two of the principal efforts in Machine Translation Evaluation (MTE) over the past decade. We describe a research program that builds on both of these efforts. This paper focuses on the selection of MT output f...

The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language

Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...

Scaling the ISLE Framework: Validating Tests of Machine Translation Quality for Multi-Dimensional Measurement

Work on comparing a set of linguistic test scores for MT output to a set of the same tests’ scores for naturally-occurring target language text (Jones and Rusk 2000) broke new ground in automating MT Evaluation. However, the tests used were selected on an ad hoc basis. In this paper, we report on work to extend our understanding, through refinement and validation, of suitable linguistic tests i...

The FEMTI guidelines for contextual MT evaluation: principles and resources

A large number of evaluation metrics exist for machine translation (MT) systems, but depending on the intended context of use of such a system, not all metrics are equally relevant. Based on the ISO/IEC 9126 and 14598 standards for software evaluation, the Framework for the Evaluation of Machine Translation in ISLE (FEMTI) provides guidelines for the selection of quality characteristics to be e...

Building a Comprehensive Conceptual Framework for Power Systems Resilience Metrics

Recently, the frequency and severity of natural and man-made disasters (extreme events), which have a high-impact low-frequency (HILF) property, have increased. These disasters can lead to extensive outages, damages, and costs in electric power systems. A power system must be built with “resilience” against disasters, which means its ability to withstand disasters efficiently while ensuring the ...

Publication date: 2002